The goal of the research described here is to develop a multistrategy classifier system that
can be used for document categorization. The system automatically discovers classification 
patterns by applying several empirical learning methods to different representations for 
preclassified documents belonging to an imbalanced sample. The learners work in a parallel 
manner, where each learner carries out its own feature selection based on evolutionary 
techniques and then obtains a classification mode ... 
Clustering is the unsupervised classification of patterns (observations, data items, or
feature vectors) into groups (clusters). The clustering problem has been addressed in many 
contexts and by researchers in many disciplines; this reflects its broad appeal and 
usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult 
problem combinatorially, and differences in assumptions and contexts in different 
communities have made the transfer of useful generic co ...
Full text available: pdf (4.28 MB) Additional Information: full citation, abstract, references, index terms

As one of the most successful applications of image analysis and understanding, face 
recognition has recently received significant attention, especially during the past several 
years. At least two reasons account for this trend: the first is the wide range of commercial 
and law enforcement applications, and the second is the availability of feasible technologies 
after 30 years of research. Even though current machine recognition systems have reached 
a certain level of maturity, their success is ... 

Keywords: Face recognition, person identification 
Full text available: pdf (677.93 KB) Additional Information: full citation, abstract, references, index terms

Classification of 3-D head models based on their shape attributes for subsequent indexing 
and retrieval are important in many applications, as in the selection and generation of 
human characters in virtual scenes, and the composition of morphing sequences requiring a 
qualitatively similar target head model. Simple feature representations are more efficient 
but may not be adequate for distinguishing the subtly different head model classes. In view 
of these, we propose an optimization approach bas ... 

Keywords: 3D head model, evolutionary computation, genetic algorithm, multiple classifier 
system, pattern classification 
Both ranking functions and user queries are very important factors affecting a search
engine's performance. Prior research has looked at how to improve ad-hoc retrieval 
performance for existing queries while tuning the ranking function, or modify and expand 
user queries using a fixed ranking scheme using blind feedback. However, almost no 
research has looked at how to combine ranking function tuning and blind feedback together 
to improve ad-hoc retrieval performance. In this paper, we look at th ... 

Keywords: blind feedback, genetic programming, information retrieval, intelligent 
information retrieval, query expansion, ranking function, search engine 
March 2004 Proceedings of the eighth annual international conference on
Full text available: pdf (990.75 KB) Additional Information: full citation, abstract, references, index terms

In nature, one finds large collections of different protein sequences exhibiting roughly the 
same three-dimensional structure, and this observation underpins the study of structural 
protein families. In studying such families at a global level, a natural question to ask is how 
close to "optimal" the native sequences are in terms of their energy. We therefore define 
and compute the evolutionary capacity of a protein structure as the total number of 
sequences whose energy in the structure i ... 

Keywords: approximate counting, evolutionary networks, protein structure, rapidly mixing 
Full text available: pdf (231.61 KB) Additional Information: full citation, abstract, references, index terms

Research has shown that most users' online information searches are suboptimal. Query 
optimization based on a relevance feedback or genetic algorithm using dynamic query 
contexts can help casual users search the Internet. These algorithms can draw on implicit 
user feedback based on the surrounding links and text in a search engine result set to 
expand user queries with a variable number of keywords in two manners. Positive 
expansion adds terms to a user's keywords with a Boolean "and," negative ...

Keywords: Information retrieval, Internet, automatic query expansion, genetic algorithm, 
implicit user feedback, personalization, relevance feedback 
May 2001 Proceedings of the fifth international conference on Autonomous agents

Full text available: pdf (240.53 KB) Additional Information: full citation, abstract, references, index terms

In this paper we describe our neurogenetic approach to developing a multi-agent decision
support system which assists users in gathering, merging, analyzing, and using information 
to assess risks and make recommendations in situations that may require tremendous 
amounts of time and attention of the users. In Phase I of this project, called the EMMA 
project, we demonstrated the feasibility of a set of solutions to various problems by building 
an intelligent agent application that makes reco ... 

Keywords: adaptation and learning, agent communication languages, evolution of agents, 
information agents, multi-agent communication/collaboration 
December 2003 The Journal of Machine Learning Research, volume 4

Full text available: pdf (300.28 KB) Additional Information: full citation, abstract, citings, index terms

Machine learning strongly relies on the covering test to assess whether a candidate 
hypothesis covers training examples. The present paper investigates learning relational 
concepts from examples, termed relational learning or inductive logic
programming. In particular, it investigates the chances of success and the
computational cost of relational learning, which appears to be severely affected by the 
presence of a phase transition in the covering test. ... 
This study reports the results of using minimum description length (MDL) analysis to model
unsupervised learning of the morphological segmentation of European languages, using 
corpora ranging in size from 5,000 words to 500,000 words. We develop a set of heuristics 
that rapidly develop a probabilistic morphological grammar, and use MDL as our primary 
tool to determine whether the modifications proposed by the heuristics will be adopted or 
not. The resulting grammar matches well the analysis that ... 

17 A QoS-Provisioning neural fuzzy connection admission controller for multimedia
high-speed networks

Ray-Guang Cheng, Chung-Ju Chang, Li-Fong Lin 

February 1999 IEEE/ACM Transactions on Networking (TON), volume 7 issue 1

Full text available: pdf (342.90 KB) Additional Information: full citation, references, citings, index terms



18 Learning methods to combine linguistic indicators: improving aspectual classification
and revealing linguistic insights
Eric V. Siegel, Kathleen R. McKeown 

December 2000 Computational Linguistics, volume 26 issue 4 

Full text available: pdf, Publisher Site Additional Information: full citation, abstract, references

Aspectual classification maps verbs to a small set of primitive categories in order to reason 
about time. This classification is necessary for interpreting temporal modifiers and assessing 
temporal relationships, and is therefore a required component for many natural language 
applications. A verb's aspectual category can be predicted by co-occurrence frequencies 
between the verb and certain linguistic modifiers. These frequency measures, called 
linguistic indicators, are chosen by linguistic insi ... 



19 Learning evaluation functions to improve optimization by local search
Justin Boyan, Andrew W. Moore 

September 2001 The Journal of Machine Learning Research, volume 1

Full text available: pdf (643.21 KB) Additional Information: full citation, abstract

This paper describes algorithms that learn to improve search performance on large-scale 
optimization tasks. The main algorithm, STAGE, works by learning an evaluation function 
that predicts the outcome of a local search algorithm, such as hillclimbing or Walksat, from 
features of states visited during search. The learned evaluation function is then used to bias 
future search trajectories toward better optima on the same problem. Another algorithm,
X-STAGE, transfers previously learned evaluation ...

20 FINkNN: a fuzzy interval number k-nearest neighbor classifier for prediction of sugar
production from populations of samples
Vassilios Petridis, Vassilis G. Kaburlasos 

December 2003 The Journal of Machine Learning Research, volume 4 

Full text available: pdf (360.76 KB) Additional Information: full citation, abstract, index terms

This work introduces FINkNN, a k-nearest-neighbor classifier operating over the metric 
lattice of conventional interval-supported convex fuzzy sets. We show that for problems
involving populations of measurements, data can be represented by fuzzy interval numbers 
(FINs) and we present an algorithm for constructing FINs from such populations. We then 
21 IS '97: model curriculum and guidelines for undergraduate degree programs in
information systems

Gordon B. Davis, John T. Gorgone, J. Daniel Couger, David L. Feinstein, Herbert E. 
Longenecker 

December 1997 ACM SIGMIS Database, Guidelines for undergraduate degree programs
on Model curriculum and guidelines for undergraduate degree
programs in information systems, volume 28 issue 1

Full text available: pdf (7.24 MB) Additional Information: full citation, citings



22 Writing the web: Mining topic-specific concepts and definitions on the web 
Bing Liu, Chee Wee Chin, Hwee Tou Ng 

May 2003 Proceedings of the twelfth international conference on World Wide Web 

Full text available: pdf (245.66 KB) Additional Information: full citation, abstract, references, citings, index terms

Traditionally, when one wants to learn about a particular topic, one reads a book or a 
survey paper. With the rapid expansion of the Web, learning in-depth knowledge about a 
topic from the Web is becoming increasingly important and popular. This is also due to the 
Web's convenience and its richness of information. In many cases, learning from the Web 
may even be essential because in our fast changing world, emerging topics appear 
constantly and rapidly. There is often not enough time for someone ... 

Keywords: definition mining, domain concept mining, information integration/knowledge
compilation, web content mining



23 Modeling dependencies in protein-DNA binding sites 
Yoseph Barash, Gal Elidan, Nir Friedman, Tommy Kaplan 

April 2003 Proceedings of the seventh annual international conference on 
Computational molecular biology 

Full text available: pdf (411.94 KB) Additional Information: full citation, abstract, references, citings, index terms

The availability of whole genome sequences and high-throughput genomic assays opens the 
door for in silico analysis of transcription regulation. This includes methods for discovering 
and characterizing the binding sites of DNA-binding proteins, such as transcription factors. 
A common representation of transcription factor binding sites is a position specific score
matrix (PSSM). This representation makes the strong assumption that binding site positions 
are independent of each othe ... 

Keywords: DNA sequence motifs, Bayesian networks, transcription factors binding sites
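
The independence assumption mentioned in this abstract can be illustrated with a minimal sketch: under a PSSM, the score of a candidate site is simply the sum of per-position log-odds terms. The aligned sites, the uniform background frequency, and the pseudocount below are hypothetical choices made for illustration only; they are not data or parameters from the paper.

```python
import math

def build_pssm(sites, background=0.25, pseudocount=1.0):
    """Build per-position log-odds scores from equal-length aligned DNA sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount  # smoothed column total
    pssm = []
    for i in range(length):
        column = [s[i] for s in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm

def score(pssm, seq):
    # Each position contributes independently, so the score is a plain sum.
    return sum(col[base] for col, base in zip(pssm, seq))

sites = ["TATA", "TATA", "TACA", "TATT"]  # hypothetical aligned binding sites
pssm = build_pssm(sites)
```

Because every column is scored separately, a model of this kind cannot express that a base at one position makes a particular base at another position more or less likely, which is the limitation the abstract points to.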



24 Using a mixture of probabilistic decision trees for direct prediction of protein function 
Umar Syed, Golan Yona 

April 2003 Proceedings of the seventh annual international conference on 
Computational molecular biology 

Full text available: pdf (306.22 KB) Additional Information: full citation, abstract, references, index terms

We study the direct relationship between basic protein properties and their function. Our 
goal is to develop a new tool for functional prediction that can be used to complement and 
support other techniques based on sequence or structure information. In order to define 
this new measure of similarity between proteins we collected a set of 453 features and 
properties that characterize proteins and are believed to be correlated and related to 
structural and functional aspects of proteins. Among thes ... 

Keywords: decision trees, functional prediction, sequence-function relationships 



25 Context-specific Bayesian clustering for gene expression data 
Yoseph Barash, Nir Friedman 

April 2001 Proceedings of the fifth annual international conference on Computational 
biology 

Full text available: pdf (233.32 KB) Additional Information: full citation, abstract, references, citings, index terms

The recent growth in genomic data and measurement of genome-wide expression patterns
allows us to examine gene regulation by transcription factors using computational tools. In this
work, we present a class of mathematical models that help in understanding the 
connections between transcription factors and functional classes of genes based on genetic 
and genomic data. These models represent the joint distribution of transcription factor 
binding sites and of expression levels of a gene in a single ... 

26 Machine learning in automated text categorization
Fabrizio Sebastiani 

March 2002 ACM Computing Surveys (CSUR), volume 34 issue 1

Full text available: pdf (524.41 KB) Additional Information: full citation, abstract, references, citings, index terms

The automated categorization (or classification) of texts into predefined categories has 
witnessed a booming interest in the last 10 years, due to the increased availability of 
documents in digital form and the ensuing need to organize them. In the research 
community the dominant approach to this problem is based on machine learning 
techniques: a general inductive process automatically builds a classifier by learning, from a 
set of preclassified documents, the characteristics of the categories. ... 

Keywords: Machine learning, text categorization, text classification 



27 Artificial life: an opportunity to include research in the computer science curriculum 
Gloria Childress Townsend, Wade Hazel 

October 2001 Journal of Computing Sciences in Colleges, volume 17 issue 1

Full text available: pdf (45.38 KB) Additional Information: full citation, abstract, references, index terms

Student research and interdisciplinary work persist as desirable components of the 
undergraduate computer science experience; yet, few schools can afford to allocate scarce 
resources to these endeavors. The course described in this paper incorporates both student 
research and interdisciplinary work with minimal expenditure of instructor time. This is 
accomplished, in part, by a biologist and a computer scientist sharing instructor duties. 
Students receive journal articles pertaining to evolution ... 

28 Test and diagnosis for complex designs: Enhancing diagnosis resolution for delay
defects based upon statistical timing and statistical fault models

A. Krstic, L.-C. Wang, K.-T. Cheng, J.-J. Liou, T. M. Mak

June 2003 Proceedings of the 40th conference on Design automation 

Full text available: pdf (101.12 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we propose a new methodology for diagnosis of delay defects in the deep
sub-micron domain. The key difference between our diagnosis framework and other traditional
diagnosis methods lies in our assumptions of the statistical circuit timing and the statistical 
delay defect size. Due to the statistical nature of the problem, achieving 100% diagnosis 
resolution cannot be guaranteed. To enhance diagnosis resolution, we propose a 3-phase 
diagnosis methodology. In the first phase, our g ... 

Keywords: delay ATPG, delay fault diagnosis, statistical timing models 

29 Computational models: Biologically inspired rule-based multiset programming
paradigm for soft-computing

E. V. Krishnamurthy, V. K. Murthy, Vikram Krishnamurthy 

April 2004 Proceedings of the first conference on computing frontiers on Computing 
frontiers 

Full text available: pdf (289.68 KB) Additional Information: full citation, abstract, references, index terms

This paper describes a rule-based multiset programming paradigm, as a unifying theme for 
biological, chemical, DNA, physical and molecular computations. The computations are 
interpreted as the outcome arising out of deterministic, nondeterministic or stochastic 
interaction among elements in a multiset object space which includes the environment. 
These interactions are like chemical reactions and the evolution of the multiset can mimic 
the biological evolution. Since the reaction rules are inhere ... 

Keywords: DNA, biologically-inspired paradigm, closed and open systems, first and second 
order logic, genetic and molecular computing, probabilistic rule based paradigm, soft 
computing 



30 SIGSAM BULLETIN: Computer algebra in the life sciences 
Michael P. Barnett 

December 2002 ACM SIGSAM Bulletin, volume 36 issue 4 

Full text available: pdf (240.15 KB) Additional Information: full citation, abstract, references

This note (1) provides references to recent work that applies computer algebra (CA) to the 
life sciences, (2) cites literature that explains the biological background of each application, 
(3) states the mathematical methods that are used, (4) mentions the benefits of CA, and 
(5) suggests some topics for future work. 

31 Developing a generic genetic algorithm 
Melvin Neville, Anaika Sibley 
December 2002 Proceedings of the 2002 annual ACM SIGAda international conference
on Ada: The engineering of correct and reliable software for real-time 
& distributed systems using Ada and related technologies 

Full text available: pdf (240.84 KB) Additional Information: full citation, abstract, references, index terms

Genetic and evolutionary algorithms, inspired by biological processes, provide a technique 
for programs to "automatically" improve their parameters. We discuss the basics of the 
algorithms and introduce our own hybrid. The development of this hybrid and its application 
to a simplified problem, evolving the coefficients for the sine function in a Taylor series, 
presents opportunities for computer science education with respect to model-building, data 
structures, and language features. Students mu ... 

Keywords: Ada education, artificial intelligence, data structures, evolutionary algorithm, 
generics, genetic algorithm, software tools, teaching, templates 



32 Research papers: data/knowledge management: Reexamining tf.idf based information
retrieval with genetic programming
Nir Oren 

September 2002 Proceedings of the 2002 annual research conference of the South
African institute of computer scientists and information technologists
on Enablement through technology

Full text available: pdf (180.72 KB) Additional Information: full citation, abstract, references, index terms

The tf.idf family of vector based information retrieval schemes is very popular due to its 
simplicity and robustness, as well as its tractability to enhancements. This paper proposes a 
method to automatically perform a search for new tf.idf like schemes using genetic 
programming. The results of this automated search are then evaluated in a simple usage 
scenario. Also evaluated are the effects of using different fitness functions in the genetic 
programming phase. 

Keywords: genetic programming, information retrieval, tf.idf 
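
For orientation, the kind of weighting that the tf.idf family builds on can be sketched as follows. This is one common textbook variant (raw term frequency multiplied by log(N/df)) and is an assumption for illustration; it is not necessarily any of the schemes evaluated or evolved in the paper. The function name and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical three-document collection.
docs = ["genetic programming for retrieval", "retrieval with tf idf", "genetic search"]
w = tf_idf(docs)
```

A genetic programming search of the sort the abstract describes would treat expressions like `tf[t] * math.log(n_docs / df[t])` as candidate programs and vary their structure, scoring each candidate with a retrieval-quality fitness function.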



33 Reports from KDD-2001: KDD Cup 2001 report 

Jie Cheng, Christos Hatzis, Hisashi Hayashi, Mark-A. Krogel, Shinichi Morishita, David Page, 
Jun Sese 

January 2002 ACM SIGKDD Explorations Newsletter, volume 3 issue 2 

Full text available: pdf (1.96 MB) Additional Information: full citation, abstract, references, citings

This paper presents results and lessons from KDD Cup 2001. KDD Cup 2001 focused on 
mining biological databases. It involved three cutting-edge tasks related to drug design and 
genomics. 

Keywords: Competition, biology, drug design, genomics 



34 Special issue on ICML: Learning probabilistic models of link structure
Lisa Getoor, Nir Friedman, Daphne Koller, Benjamin Taskar 
March 2003 The Journal of Machine Learning Research, volume 3 

Full text available: pdf (479.67 KB) Additional Information: full citation, abstract, index terms

Most real-world data is heterogeneous and richly interconnected. Examples include the 
Web, hypertext, bibliometric data and social networks. In contrast, most statistical learning 
methods work with "flat" data representations, forcing us to convert our data into a form 
that loses much of the link structure. The recently introduced framework of probabilistic 
relational models (PRMs) embraces the object-relational nature of structured data by 
capturing probabilistic interactions between att ...

35 Computing curricula 2001

September 2001 Journal on Educational Resources in Computing (JERIC) 

Full text available: pdf (613.63 KB), html (2.70 KB) Additional Information: full citation, references, citings, index terms



36 An updated survey of GA-based multiobjective optimization techniques
Carlos A. Coello 

June 2000 ACM Computing Surveys (CSUR), volume 32 issue 2 

Full text available: pdf (250.77 KB) Additional Information: full citation, abstract, references, citings, index terms

After using evolutionary techniques for single-objective optimization during more than two 
decades, the incorporation of more than one objective in the fitness function has finally 
become a popular area of research. As a consequence, many new evolutionary-based 
approaches and variations of existing techniques have recently been published in the 
technical literature. The purpose of this paper is to summarize and organize the information 
on these current approaches, emphasizing the importanc ... 

Keywords: artificial intelligence, genetic algorithms, multicriteria optimization, 
multiobjective optimization, vector optimization 

37 Risk analysis: New simulation methodology for risk analysis: genetic programming with
Monte Carlo simulation for option pricing

N. K. Chidambaran 

December 2003 Proceedings of the 35th conference on Winter simulation: driving 
innovation 

Full text available: pdf (351.90 KB) Additional Information: full citation, abstract, references

I examine the role of programming parameters in determining the accuracy of Genetic 
Programming for option pricing. I use Monte Carlo simulations to generate stock and option 
price data needed to develop a Genetic Option Pricing Program. I simulate data for two 
different stock price processes - a Geometric Brownian process and a Jump-Diffusion 
process. In the jump-diffusion setting, I seed the Genetic Program with the Black-Scholes 
equation as a starting approximation. I find that population ... 

38 A method for optimal design of a threading scoring function
Jadwiga R. Bienkowska, Robert G. Rogers, Temple F. Smith

April 1999 Proceedings of the third annual international conference on Computational 
molecular biology 

Full text available: pdf (1.13 MB) Additional Information: full citation, references, index terms



39 Information access and retrieval: Using genetic algorithms to find suboptimal retrieval
expert combinations

Holger Billhardt, Daniel Borrajo, Victor Maojo 

March 2002 Proceedings of the 2002 ACM symposium on Applied computing 

Full text available: pdf (613.27 KB) Additional Information: full citation, abstract, references, index terms

A common problem of expert combination approaches in Information Retrieval (IR) is the 
selection of both, the experts to be combined and the combination function. In most studies 
the experts are selected from a rather small set of candidates using some heuristics. Thus,
only a reduced number of possible combinations is considered and other possibly better 
solutions are left out. In this paper we propose the use of genetic algorithms to find a 
suboptimal combination of experts for a document coll ... 

Keywords: data fusion, genetic algorithms, information retrieval 



40 A class of edit kernels for SVMs to predict translation initiation sites in eukaryotic 
mRNAs 

Haifeng Li, Tao Jiang 

March 2004 Proceedings of the eighth annual international conference on 
Computational molecular biology 

Full text available: pdf (151.57 KB) Additional Information: full citation, abstract, references, index terms

The prediction of translation initiation sites (TISs) in eukaryotic mRNAs has been a 
challenging problem in computational molecular biology. In this paper, we present a new 
algorithm to recognize TISs with a very high accuracy. Our algorithm includes two novel 
ideas. First, we introduce a class of new sequence-similarity kernels based on string edit, 
called the edit kernels, for use with support vector machines (SVMs) in a discriminative 
approach to predict TISs. The edit ke ... 

Keywords: edit distance, mRNA, machine learning, positive definite kernel, support vector 
machine, translation initiation site 
41 VLSI cell placement techniques
K. Shahookar, P. Mazumder 

June 1991 ACM Computing Surveys (CSUR), volume 23 issue 2 

Full text available: pdf (5.28 MB) Additional Information: full citation, abstract, references, citings, index terms, review



VLSI cell placement problem is known to be NP complete. A wide repertoire of heuristic 
algorithms exists in the literature for efficiently arranging the logic cells on a VLSI chip. The 
objective of this paper is to present a comprehensive survey of the various cell placement 
techniques, with emphasis on standard cell and macro placement. Five major algorithms for 
placement are discussed: simulated annealing, force-directed placement, min-cut 
placement, placement by numerical optimization, a ... 

Keywords: VLSI, floor planning, force-directed placement, gate array, genetic algorithm, 
integrated circuits, layout, min-cut, physical design, placement, simulated annealing, 
standard cell 



42 Finding short DNA motifs using permuted Markov models
Xiaoyue Zhao, Haiyan Huang, Terence P. Speed 

March 2004 Proceedings of the eighth annual international conference on 
Computational molecular biology 

Full text available: pdf (208.95 KB) Additional Information: full citation, abstract, references, index terms

Many short DNA motifs such as transcription factor binding sites (TFBS) and splice sites 
exhibit strong local as well as non-local dependence. We introduce permuted variable length 
Markov models (PVLMM) which could capture the potentially important dependencies among 
positions, and apply them to the problem of detecting splice and TFB sites. They have been 
satisfactory from the viewpoint of prediction performance, and also give ready biological 
interpretations of the sequence dependence observed ... 

Keywords: Jeffreys mixture, maximal dependence decomposition, model selection, 
permuted variable length Markov models, splice sites, transcription factor binding sites, 
weight matrix models 
An empirical study of non-binary genetic algorithm-based neural approaches for
Parag C. Pendharkar, James A. Rodger

January 1999 Proceeding of the 20th international conference on Information Systems 

Full text available: pdf (192.45 KB) Additional Information: full citation, references, citings, index terms



44 Manufacturing applications: Simulation optimization in manufacturing analysis: a
simulation-optimization approach using genetic search for supplier selection 

Hongwei Ding, Lyes Benyoucef, Xiaolan Xie 

December 2003 Proceedings of the 35th conference on Winter simulation: driving 
innovation 

Full text available: pdf (339.90 KB) Additional Information: full citation, abstract, references

The paper presents a simulation-optimization approach using genetic algorithm to the 
supplier selection problem. The problem consists in selecting a portfolio of suppliers from a 
set of pre-selected candidates. The supplier selection is a multi-criteria problem that 
includes both qualitative and quantitative criteria. In order to select the best suppliers it is 
crucial to make a trade off between these tangible and intangible criteria, some of which 
may be contradictory. The proposed approach ... 

45 Using genetic algorithms to inductively reason with cases in the legal domain
Anandeep S. Pannu 

May 1995 Proceedings of the fifth international conference on Artificial intelligence 
and law 

Full text available: pdf (972.17 KB) Additional Information: full citation, references, citings, index terms



46 A maximum entropy approach to species distribution modeling
Steven J. Phillips, Miroslav Dudik, Robert E. Schapire 

July 2004 Twenty-first international conference on Machine learning 

Full text available: pdf (163.78 KB) Additional Information: full citation, abstract, references

We study the problem of modeling species geographic distributions, a critical problem in 
conservation biology. We propose the use of maximum-entropy techniques for this problem, 
specifically, sequential-update algorithms that can handle a very large number of features. 
We describe experiments comparing maxent with a standard distribution-modeling tool, 
called GARP, on a dataset containing observation data for North American breeding birds. 
We also study how well maxent performs as a function of ... 

47 Evolutionary computing and optimization: Issues in parallelizing multiobjective
evolutionary algorithms for real world applications

David A. Van Veldhuizen, Jesse B. Zydallis, Gary B. Lamont 

March 2002 Proceedings of the 2002 ACM symposium on Applied computing 

Full text available: pdf (862.18 KB) Additional Information: full citation, abstract, references, index terms

The concepts of efficiency and effectiveness must be addressed in conducting research into 
using an Evolutionary Algorithm (EA) for optimization problems. The increased use of
evolutionary approaches for real-world applications, containing multiple objectives and high 
dimensionality, has led to the design and generation of a number of Multiobjective 
Evolutionary Algorithms (MOEA). When analyzing these algorithms, the issues of 
effectiveness and efficiency are extremely important and typically dri ... 

Keywords: multiobjective evolutionary algorithm, parallel algorithms 
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48 Combining phylogenetic and hidden Markov models in biosequence analysis 
Adam Siepel, David Haussler 

April 2003 Proceedings of the seventh annual international conference on 
Computational molecular biology 

Full text available: pdf(294.94 KB) Additional Information: full citation, abstract, references, citings, index terms

A few models have appeared in recent years that consider not only the way substitutions 
occur through evolutionary history at each site of a genome, but also the way the process 
changes from one site to the next. These models combine phylogenetic models of molecular 
evolution, which apply to individual sites, and hidden Markov models, which allow for 
changes from site to site. Besides improving the realism of ordinary phylogenetic models, 
they are potentially very powerful tools for inference an ... 

Keywords: context-sensitive substitution, maximum likelihood 



49 Contributed articles: Genetic subtyping using cluster analysis 
Tom Burr, James R. Gattiker, Greggory S. LaBerge 
July 2001 ACM SIGKDD Explorations Newsletter, volume 3 issue 1

Full text available: pdf(984.40 KB) Additional Information: full citation, abstract, references

In this paper we (1) describe state-of-the-art methods to identify clusters in DNA sequence 
data for taxonomic analysis; (2) describe a new method with better scaling properties 
based on model-based clustering, and (3) present examples using the nucleoprotein and 
hemagglutin regions of influenza and the env and gag regions of human immunodeficiency 
virus (HIV). 

Keywords: DNA sequence analysis, HIV, influenza, model-based clustering, phylogenetic 
trees 



50 Genetic algorithm for fuzzy modeling of robotic manipulators
Trung T. Pham

February 1996 Proceedings of the 1996 ACM symposium on Applied Computing

Full text available: pdf(440.80 KB) Additional Information: full citation, references, index terms



51 Applying online gradient descent search to genetic programming for object recognition
Will Smart, Mengjie Zhang

January 2004 Proceedings of the second workshop on Australasian information
security, Data Mining and Web Intelligence, and Software
Internationalisation - Volume 32

Full text available: pdf(207.11 KB) Additional Information: full citation, abstract, references

This paper describes an approach to the use of gradient descent search in genetic 
programming (GP) for object classification problems. In this approach, pixel statistics are 
used to form the feature terminals and a random generator produces numeric terminals. 
The four arithmetic operators and a conditional operator form the function set and the 
classification accuracy is used as the fitness function. In particular, gradient descent search 
is introduced to the GP mechanism and is embedded into th ... 

Keywords: data mining, genetic programming, machine learning, object classification 
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52 Evaluation of prediction models for marketing campaigns 
Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik, Izhak Idan 
August 2001 Proceedings of the seventh ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(634.58 KB) Additional Information: full citation, abstract, references, citings, index terms

We consider prediction-model evaluation in the context of marketing-campaign planning. In 
order to evaluate and compare models with specific campaign objectives in mind, we need 
to concentrate our attention on the appropriate evaluation-criteria. These should portray 
the model's ability to score accurately and to identify the relevant target population. In this 
paper we discuss some applicable model-evaluation and selection criteria, their relevance 
for campaign planning, their robustness under ... 

Keywords: Confidence Intervals, Marketing Campaigns, Model Evaluation, Performance 
Measures 



53 Use of genetic algorithms for optimization in digital control of dynamic systems
Rajeshwar Prasad Srivastava

April 1992 Proceedings of the 1992 ACM annual conference on Communications

Full text available: pdf(494.85 KB) Additional Information: full citation, abstract, references, index terms

This paper presents a method to optimize proportional-integral-derivative (PID) control 
parameters, given a discrete model of the controlled process. This method is based on 
Holland's genetic algorithm (GA). It does not require a mathematical model of the controller 
to represent its dynamic behavior. It gives a solution that is not only optimal but also meets 
engineering constraints. Genetic algorithms do a global search without derivatives for points 
in a multi-dimensional search space. Th ... 

54 Meta optimization: improving compiler heuristics with machine learning
Mark Stephenson, Saman Amarasinghe, Martin Martin, Una-May O'Reilly

May 2003 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN 2003 conference
on Programming language design and implementation, volume 38 issue 5

Full text available: pdf(302.23 KB) Additional Information: full citation, abstract, references, citings, index terms

Compiler writers have crafted many heuristics over the years to approximately solve NP- 
hard problems efficiently. Finding a heuristic that performs well on a broad range of 
applications is a tedious and difficult process. This paper introduces Meta Optimization, a 
methodology for automatically fine-tuning compiler heuristics. Meta Optimization uses 
machine-learning techniques to automatically search the space of compiler heuristics. Our 
techniques reduce compiler design complexity by relieving c ... 

Keywords: compiler heuristics, genetic programming, machine learning, priority functions 
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July 2002 Proceedings of the eighth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(853.58 KB) Additional Information: full citation, abstract, references, citings, index
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In this paper we propose a scaling-up method that is applicable to essentially any induction 
algorithm based on discrete search. The result of applying the method to an algorithm is 
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that its running time becomes independent of the size of the database, while the decisions 
made are essentially identical to those that would be made given infinite data. The method 
works within pre-specified memory limits and, as long as the data is iid, only requires 
accessing it sequentially. It gives anytime resu ... 

Keywords: Bayesian networks, Hoeffding bounds, discrete search, scalable learning 
algorithms, subsampling 
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G. Robles-De-La-Torre, R. Sekuler 

October 2004 ACM Transactions on Applied Perception (TAP), volume 1 issue 2

Full text available: pdf(584.66 KB) Additional Information: full citation, abstract, references, index terms

Precise manipulation of objects is ordinarily limited by visual, kinesthetic, motor, and 
cognitive factors. Specially designed virtual objects and tasks minimize such limitations, 
making it possible to isolate and estimate the internal model that guides subjects'
performance. Subjects manipulated a computer-generated virtual object (vO), attempting 
to align vO to a target whose position changed randomly every 10 s. To analyze the control 
actions subjects use while manipulating the ... 

Keywords: Dynamics, human cognition, human information processing, ideal performer, 
internal model, virtual object, virtual reality 
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Gert R. G. Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui, Michael I. Jordan 

August 2004 The Journal of Machine Learning Research, volume 5 

Full text available: pdf(467.50 KB) Additional Information: full citation, abstract, index terms

Kernel-based learning algorithms work by embedding the data into a Euclidean space, and 
then searching for linear relations among the embedded data points. The embedding is 
performed implicitly, by specifying the inner products between each pair of points in the 
embedding space. This information is contained in the so-called kernel matrix, a symmetric 
and positive semidefinite matrix that encodes the relative positions of all points. Specifying 
this matrix amounts to specifying the geometry of t ... 

58 World Wide Web: Mining the web for answers to natural language questions
Dragomir R. Radev, Hong Qi, Zhiping Zheng, Sasha Blair-Goldensohn, Zhu Zhang, Weiguo Fan, 
John Prager 

October 2001 Proceedings of the tenth international conference on Information and 
knowledge management 
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terms

The web is now becoming one of the largest information and knowledge repositories. Many 
large scale search engines (Google, Fast, Northern Light, etc.) have emerged to help users 
find information. In this paper, we study how we can effectively use these existing search 
engines to mine the Web and discover the "correct" answers to factual natural language 
questions. We propose a probabilistic algorithm called QASM (Question Answering using 
Statistical Models) that learns the best query para ... 

59 Special issue on ICML: Policy search using paired comparisons
Malcolm J. A. Strens, Andrew W. Moore 
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Direct policy search is a practical way to solve reinforcement learning (RL) problems, 
involving continuous state and action spaces. The goal becomes finding policy parameters 
that maximize a noisy objective function. The Pegasus method converts this stochastic 
optimization problem into a deterministic one, by using fixed start states and fixed random 
number sequences for comparing policies (Ng and Jordan, 2000). We evaluate Pegasus, 
and new paired comparison methods, using the mountain car probl ... 

60 Bioinformatics (BIO): Identification of fundamental building blocks in protein sequences
using statistical association measures
Deborah Weisser, Judith Klein-Seetharaman

March 2004 Proceedings of the 2004 ACM symposium on Applied computing

Full text available: pdf(650.19 KB) Additional Information: full citation, abstract, references, index terms

Protein sequence data is abundant, yet derivation of structural features from sequence 
alone is generally restricted to prediction of domain architecture, secondary structure 
elements and motifs. Precise feature boundaries cannot be determined reliably, and it is 
unknown to what extent these features constitute fundamental building blocks of protein 
sequences, a question with particular relevance to protein folding. Here we propose a 
statistical approach using mutual information, a measure of as ... 

Keywords: G-protein coupled receptors, feature prediction, mutual information, rhodopsin 
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Full text available: pdf(5.28 MB)
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VLSI cell placement problem is known to be NP complete. A wide repertoire of heuristic 
algorithms exists in the literature for efficiently arranging the logic cells on a VLSI chip. The 
objective of this paper is to present a comprehensive survey of the various cell placement 
techniques, with emphasis on standard cell and macro placement. Five major algorithms for 
placement are discussed: simulated annealing, force-directed placement, min-cut 
placement, placement by numerical optimization, a ... 

Keywords: VLSI, floor planning, force-directed placement, gate array, genetic algorithm, 
integrated circuits, layout, min-cut, physical design, placement, simulated annealing, 
standard cell 
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Xiaoyue Zhao, Haiyan Huang, Terence P. Speed 

March 2004 Proceedings of the eighth annual international conference on 
Computational molecular biology 

Full text available: pdf(208.95 KB) Additional Information: full citation, abstract, references, index terms

Many short DNA motifs such as transcription factor binding sites (TFBS) and splice sites 
exhibit strong local as well as non-local dependence. We introduce permuted variable length 
Markov models (PVLMM) which could capture the potentially important dependencies among 
positions, and apply them to the problem of detecting splice and TFB sites. They have been 
satisfactory from the viewpoint of prediction performance, and also give ready biological 
interpretations of the sequence dependence observed ...

Keywords: Jeffreys mixture, maximal dependence decomposition, model selection, 
permuted variable length Markov models, splice sites, transcription factor binding sites, 
weight matrix models 
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44 Manufacturing applications: Simulation optimization in manufacturing analysis: a 
simulation-optimization approach using genetic search for supplier selection
Hongwei Ding, Lyes Benyoucef, Xiaolan Xie

December 2003 Proceedings of the 35th conference on Winter simulation: driving
innovation

Full text available: pdf(339.90 KB) Additional Information: full citation, abstract, references

The paper presents a simulation-optimization approach using genetic algorithm to the 
supplier selection problem. The problem consists in selecting a portfolio of suppliers from a 
set of pre-selected candidates. The supplier selection is a multi-criteria problem that 
includes both qualitative and quantitative criteria. In order to select the best suppliers it is 
crucial to make a trade off between these tangible and intangible criteria, some of which 
may be contradictory. The proposed approach ... 

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
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
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




http://portal.acm.org/resultsxfin?queiy^a 11/27/04 



Results (page 3): training and ("evaluation function" or "fitness function") and statistical and model and ... Page 4 of 6 



52 Evaluation of prediction models for marketing campaigns 

Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik, Izhak Idan 
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Knowledge discovery and data mining 
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We consider prediction-model evaluation in the context of marketing-campaign planning. In 
order to evaluate and compare models with specific campaign objectives in mind, we need 
to concentrate our attention on the appropriate evaluation-criteria. These should portray 
the model's ability to score accurately and to identify the relevant target population. In this 
paper we discuss some applicable model-evaluation and selection criteria, their relevance 
for campaign planning, their robustness under ... 

Keywords: Confidence Intervals, Marketing Campaigns, Model Evaluation, Performance 
Measures 
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

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In this paper we propose a scaling-up method that is applicable to essentially any induction 
algorithm based on discrete search. The result of applying the method to an algorithm is 
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that its running time becomes independent of the size of the database, while the decisions 
made are essentially identical to those that would be made given infinite data. The method 
works within pre-specified memory limits and, as long as the data is iid, only requires 
accessing it sequentially. It gives anytime resu ... 

Keywords: Bayesian networks, Hoeffding bounds, discrete search, scalable learning 
algorithms, subsampling 
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
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

58 World Wide Web: Mining the web for answers to natural language questions
Dragomir R. Radev, Hong Qi, Zhiping Zheng, Sasha Blair-Goldensohn, Zhu Zhang, Weiguo Fan, 
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The web is now becoming one of the largest information and knowledge repositories. Many 
large scale search engines (Google, Fast, Northern Light, etc.) have emerged to help users 
find information. In this paper, we study how we can effectively use these existing search 
engines to mine the Web and discover the "correct" answers to factual natural language 
questions. We propose a probabilistic algorithm called QASM (Question Answering using 
Statistical Models) that learns the best query para ... 

59 Special issue on ICML: Policy search using paired comparisons
Malcolm J. A. Strens, Andrew W. Moore 

March 2003 The Journal of Machine Learning Research, volume 3 
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Direct policy search is a practical way to solve reinforcement learning (RL) problems 
involving continuous state and action spaces. The goal becomes finding policy parameters 
that maximize a noisy objective function. The Pegasus method converts this stochastic 
optimization problem into a deterministic one, by using fixed start states and fixed random 
number sequences for comparing policies (Ng and Jordan, 2000). We evaluate Pegasus, 
and new paired comparison methods, using the mountain car probl ... 

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